Smart Paper Notes

Adversarial Deep Learning for Online Resource Allocation

Bingqian Du, Zhiyi Huang, Chuan Wu

Categories: Machine Learning | Artificial Intelligence

2021-11-19

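Online algorithms are an important branch of algorithm design. Designing online algorithms with a bounded competitive ratio (in terms of worst-case performance) can be hard and usually relies on problem-specific assumptions. Inspired by the adversarial training of Generative Adversarial Nets (GAN) and by the fact that the competitive ratio of an online algorithm is defined on worst-case inputs, we adopt deep neural networks to learn, from scratch, an online algorithm for a resource allocation and pricing problem, with the goal of minimizing the performance gap between the offline optimum and the learned online algorithm on worst-case inputs. Specifically, we use two neural networks as the algorithm and the adversary, respectively, and let them play a zero-sum game: the adversary is responsible for generating worst-case inputs, while the algorithm learns the best strategy based on the inputs provided by the adversary. To ensure better convergence of the algorithm network (to the desired online algorithm), we propose a novel per-round update method for sequential decision making that breaks the complex dependency among different rounds, so that updates can be made for every possible action instead of only the sampled actions. To the best of our knowledge, our work is the first to use deep neural networks to design an online algorithm from the perspective of worst-case performance guarantees. Empirical studies show that our update method ensures convergence to a Nash equilibrium and that the learned algorithm outperforms state-of-the-art online algorithms under various settings.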
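
The min-max structure behind this zero-sum game can be illustrated on a much smaller problem. The sketch below is not the paper's method: it assumes the classic ski-rental problem (rent for 1 per day or buy once for `buy_price`) instead of resource allocation and pricing, and it replaces both neural networks with brute-force enumeration. The "adversary" picks the season length that maximizes the ratio of online cost to offline optimum, and the "algorithm" picks the buy day that minimizes that worst-case ratio. All function names are hypothetical.

```python
from fractions import Fraction


def online_cost(buy_day: int, ski_days: int, buy_price: int) -> int:
    """Cost of renting until `buy_day` and buying on that day if still skiing."""
    if ski_days < buy_day:
        return ski_days                   # season ended before the purchase
    return (buy_day - 1) + buy_price      # rented buy_day - 1 days, then bought


def offline_optimum(ski_days: int, buy_price: int) -> int:
    """Best cost when the season length is known in advance."""
    return min(ski_days, buy_price)


def competitive_ratio(buy_day: int, buy_price: int, max_days: int) -> Fraction:
    """Adversary's move: the worst ratio over all season lengths 1..max_days."""
    return max(
        Fraction(online_cost(buy_day, d, buy_price), offline_optimum(d, buy_price))
        for d in range(1, max_days + 1)
    )


def best_threshold(buy_price: int, max_days: int) -> int:
    """Algorithm's move: the buy day with the smallest worst-case ratio."""
    return min(
        range(1, max_days + 1),
        key=lambda k: competitive_ratio(k, buy_price, max_days),
    )


if __name__ == "__main__":
    k = best_threshold(buy_price=10, max_days=50)
    print(k, competitive_ratio(k, 10, 50))
```

Exact fractions are used so that ties between strategies are not decided by floating-point rounding.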

Lower Difficulty and Better Robustness: A Bregman Divergence Perspective for Adversarial Training

Zihui Wu, Haichang Gao, Bingqian Zhou, Xiaoyan Guo, Shudong Zhang

Categories: Machine Learning

2022-08-26

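In this paper, we investigate improving the adversarial robustness obtained by adversarial training (AT) by reducing the difficulty of optimization. To better study this problem, we build a novel Bregman divergence perspective for AT, in which AT can be viewed as a sliding process of the training data points on the negative entropy curve. Based on this perspective, we analyze the learning objectives of two typical AT methods, PGD-AT and TRADES, and we find that the optimization process of TRADES is easier than that of PGD-AT because TRADES separates PGD-AT. In addition, we discuss the function of entropy in TRADES, and we find that models with high entropy can be better robustness learners. Inspired by these findings, we propose two methods, FAIT and MER, which not only reduce the difficulty of optimization under 10-step PGD adversaries but also provide better robustness. Our work suggests that reducing the difficulty of optimization under 10-step PGD adversaries is a promising approach for enhancing adversarial robustness in AT.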
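
As background for the Bregman divergence perspective, the sketch below checks a standard identity numerically: the Bregman divergence generated by negative entropy, evaluated on two probability vectors, equals their KL divergence. How the paper builds on this is not stated in the abstract, so the code only shows the identity; the function names and the example distributions are illustrative assumptions.

```python
import math


def negative_entropy(p):
    """phi(p) = sum_i p_i * log(p_i), with 0 * log(0) treated as 0."""
    return sum(x * math.log(x) for x in p if x > 0)


def grad_negative_entropy(q):
    """Gradient of phi at q; requires every q_i > 0."""
    return [math.log(x) + 1.0 for x in q]


def bregman_divergence(p, q):
    """D_phi(p, q) = phi(p) - phi(q) - <grad phi(q), p - q>."""
    grad = grad_negative_entropy(q)
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad, p, q))
    return negative_entropy(p) - negative_entropy(q) - inner


def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    clean = [0.7, 0.2, 0.1]        # hypothetical prediction on a clean input
    perturbed = [0.5, 0.3, 0.2]    # hypothetical prediction on a perturbed input
    print(bregman_divergence(clean, perturbed), kl_divergence(clean, perturbed))
```

The two printed values agree up to floating-point error because the linear terms cancel when both vectors sum to 1.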

Contrastive Instruction-Trajectory Learning for Vision-Language Navigation

Xiwen Liang, Fengda Zhu, Yi Zhu, Bingqian Lin, Bing Wang, Xiaodan Liang

Categories: Computer Vision | Natural Language Processing | Machine Learning

2021-12-08

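The Vision-Language Navigation (VLN) task requires an agent to reach a target under the guidance of natural language instructions. Previous works learn to navigate step by step following an instruction. However, these works may fail to discriminate the similarities and discrepancies across instruction-trajectory pairs, and they ignore the temporal continuity of sub-instructions. These problems hinder agents from learning distinctive vision-and-language representations, harming the robustness and generalizability of the navigation policy. In this paper, we propose a Contrastive Instruction-Trajectory Learning (CITL) framework that explores invariance across similar data samples and variance across different ones to learn distinctive representations for robust navigation. Specifically, we propose: (1) a coarse-grained contrastive learning objective that enhances vision-and-language representations by contrasting the semantics of full trajectory observations and of instructions, respectively; (2) a fine-grained contrastive learning objective that perceives instructions by leveraging the temporal information of sub-instructions; (3) a pairwise sample-reweighting mechanism for contrastive learning that mines hard samples and thereby mitigates the influence of data sampling bias. Our CITL can be easily integrated with VLN backbones to form a new learning paradigm and achieves better generalizability in unseen environments. Extensive experiments show that models with CITL surpass the previous state-of-the-art methods on R2R, R4R, and RxR.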
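
The abstract does not give the form of the contrastive objectives. As a rough illustration of what contrasting matched and mismatched samples means, the sketch below assumes the widely used InfoNCE loss over cosine similarities; the 2-D embeddings, the temperature, and the function names are all hypothetical and are not taken from CITL.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: -log softmax score of the positive among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[0]


if __name__ == "__main__":
    instruction = [1.0, 0.0]                 # toy instruction embedding
    matched_trajectory = [0.9, 0.1]          # trajectory that follows it
    other_trajectories = [[0.0, 1.0], [-1.0, 0.2]]
    print(info_nce(instruction, matched_trajectory, other_trajectories))
```

The loss is small when the anchor is closest to its positive and grows when a negative is more similar, which is the signal that pulls matched instruction-trajectory pairs together.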

One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Bingqian Lu, Jianyi Yang, Weiwen Jiang, Yiyu Shi, Shaolei Ren

Categories: Machine Learning | Artificial Intelligence

2021-11-01

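Convolutional neural networks (CNNs) are used in many real-world applications, such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device is common practice in the state of the art, it is a very time-consuming process that lacks scalability in the presence of extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity: the latency rankings of architectures on different devices are often correlated. When strong latency monotonicity exists, we can reuse architectures searched for one proxy device on new target devices without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique that significantly boosts latency monotonicity. Finally, we validate our approach with experiments on devices of different platforms and on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS, and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device. GitHub: https://github.com/ren-research/oneproxy.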
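
One natural way to quantify how well latency rankings agree between two devices is Spearman's rank correlation, which is 1 when the rankings are identical. The sketch below assumes that measure and uses made-up latency numbers; it is an illustration of latency monotonicity, not code from the OneProxy repository.

```python
import math


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical latencies (ms) of the same four architectures on two devices.
    proxy_device = [12.0, 18.5, 25.1, 40.3]
    target_device = [30.2, 41.0, 77.7, 90.5]
    print(spearman(proxy_device, target_device))
```

A value close to 1 suggests that an architecture ranking obtained on the proxy device would carry over to the target device; lower values correspond to the case where the abstract calls for proxy adaptation.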

Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation

Han Huang, Leilei Sun, Bowen Du, Weifeng Lv

Categories: Machine Learning

2023-01-01

Learning the underlying distribution of molecular graphs and generating high-fidelity samples is a fundamental research problem in drug discovery and material science. However, accurately modeling distribution and rapidly generating novel molecular graphs remain crucial and challenging goals. To accomplish these goals, we propose a novel Conditional Diffusion model based on discrete Graph Structures (CDGS) for molecular graph generation. Specifically, we construct a forward graph diffusion process on both graph structures and inherent features through stochastic differential equations (SDE) and derive discrete graph structures as the condition for reverse generative processes. We present a specialized hybrid graph noise prediction model that extracts the global context and the local node-edge dependency from intermediate graph states. We further utilize ordinary differential equation (ODE) solvers for efficient graph sampling, based on the semi-linear structure of the probability flow ODE. Experiments on diverse datasets validate the effectiveness of our framework. Particularly, the proposed method still generates high-quality molecular graphs in a limited number of steps.

HUSP-SP: Faster Utility Mining on Sequence Data

Chunkai Zhang, Yuting Yang, Zilin Du, Wensheng Gan, Philip S. Yu

Categories: Artificial Intelligence

2022-12-29

High-utility sequential pattern mining (HUSPM) has emerged as an important topic due to its wide application and considerable popularity. However, due to the combinatorial explosion of the search space when the HUSPM problem encounters a low utility threshold or large-scale data, it may be time-consuming and memory-costly to address the HUSPM problem. Several algorithms have been proposed for addressing this problem, but they still cost a lot in terms of running time and memory usage. In this paper, to further solve this problem efficiently, we design a compact structure called sequence projection (seqPro) and propose an efficient algorithm, namely discovering high-utility sequential patterns with the seqPro structure (HUSP-SP). HUSP-SP utilizes the compact seq-array to store the necessary information in a sequence database. The seqPro structure is designed to efficiently calculate candidate patterns' utilities and upper bound values. Furthermore, a new upper bound on utility, namely tighter reduced sequence utility (TRSU) and two pruning strategies in search space, are utilized to improve the mining performance of HUSP-SP. Experimental results on both synthetic and real-life datasets show that HUSP-SP can significantly outperform the state-of-the-art algorithms in terms of running time, memory usage, search space pruning efficiency, and scalability.
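
To make the notion of pattern utility concrete, the sketch below computes it by direct dynamic programming on a toy database. It is not HUSP-SP: it uses neither the seqPro structure nor the TRSU bound, and it assumes a simplified setting in which every sequence element is a single item with a precomputed utility. The utility of a pattern in a sequence is taken as the maximum over all of its occurrences, and the database utility is the sum over sequences. The data and the threshold are made up.

```python
def utility_in_sequence(pattern, sequence):
    """Maximum utility over all occurrences of `pattern` in one sequence.

    `sequence` is a list of (item, utility) pairs; returns 0 if no occurrence.
    """
    # best[j] = best utility of matching the first j pattern items so far.
    best = [0] + [None] * len(pattern)
    for item, utility in sequence:
        # Go backwards so one element is never used for two pattern positions.
        for j in range(len(pattern), 0, -1):
            if item == pattern[j - 1] and best[j - 1] is not None:
                candidate = best[j - 1] + utility
                if best[j] is None or candidate > best[j]:
                    best[j] = candidate
    return best[-1] if best[-1] is not None else 0


def utility_in_database(pattern, database):
    """Sum of the pattern's utility over every sequence in the database."""
    return sum(utility_in_sequence(pattern, seq) for seq in database)


if __name__ == "__main__":
    database = [
        [("a", 2), ("b", 6), ("a", 5), ("c", 1), ("b", 3)],
        [("b", 4), ("a", 1), ("c", 2)],
        [("a", 3), ("c", 7), ("b", 2)],
    ]
    threshold = 15  # hypothetical minimum utility
    for pattern in (["a", "b"], ["a", "c"]):
        total = utility_in_database(pattern, database)
        print(pattern, total, total >= threshold)
```

Enumerating and scoring every candidate this way is what becomes expensive at low thresholds, which is the cost that upper bounds and pruning strategies aim to reduce.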

PersonaSAGE: A Multi-Persona Graph Neural Network

Gautam Choudhary, Iftikhar Ahamath Burhanuddin, Eunyee Koh, Fan Du, Ryan A. Rossi

Categories: Machine Learning

2022-12-28

Graph Neural Networks (GNNs) have become increasingly important in recent years due to their state-of-the-art performance on many important downstream applications. Existing GNNs have mostly focused on learning a single node representation, even though a node often exhibits polysemous behavior in different contexts. In this work, we develop a persona-based graph neural network framework called PersonaSAGE that learns multiple persona-based embeddings for each node in the graph. Such disentangled representations are more interpretable and useful than a single embedding. Furthermore, PersonaSAGE learns the appropriate set of persona embeddings for each node in the graph, and every node can have a different number of assigned persona embeddings. The framework is flexible, and its general design makes the learned embeddings widely applicable across domains. We use publicly available benchmark datasets to evaluate our approach against a variety of baselines. The experiments demonstrate the effectiveness of PersonaSAGE for a variety of important tasks, including link prediction, where we achieve an average gain of 15%, while remaining competitive for node classification. Finally, we also demonstrate the utility of PersonaSAGE with a case study on personalized recommendation of different entity types in a data management platform.

NEEDED: Introducing Hierarchical Transformer to Eye Diseases Diagnosis

Xu Ye, Meng Xiao, Zhiyuan Ning, Weiwei Dai, Wenjuan Cui, Yi Du, Yuanchun Zhou

Categories: Natural Language Processing

2022-12-27

With the development of natural language processing (NLP) techniques, automatic diagnosis of eye diseases using ophthalmology electronic medical records (OEMR) has become possible. It aims to evaluate the condition of each of a patient's eyes separately, and in this paper we formulate it as a particular multi-label classification task. Although there are a few related studies on other diseases, automatic diagnosis of eye diseases exhibits unique characteristics. First, descriptions of both eyes are mixed up in OEMR documents, with both free text and templated asymptomatic descriptions, resulting in sparse and cluttered information. Second, OEMR documents contain multiple descriptive parts and are long. Third, it is critical for the disease diagnosis model to be explainable. To overcome these challenges, we present an effective automatic eye disease diagnosis framework, NEEDED. In this framework, a preprocessing module is integrated to improve the density and quality of information. Then, we design a hierarchical transformer structure for learning contextualized representations of each sentence in the OEMR document. For the diagnosis part, we propose an attention-based predictor that enables traceable diagnosis by obtaining disease-specific information. Experiments on a real dataset and comparisons with several baseline models show the advantage and explainability of our framework.

Transformer and GAN Based Super-Resolution Reconstruction Network for Medical Images

Weizhi Du, Harvery Tian

Categories: Computer Vision

2022-12-26

Because of the need to obtain high-quality images with minimal radiation doses, such as in low-field magnetic resonance imaging (MRI), super-resolution reconstruction in medical imaging has become more popular. However, due to the complexity and high aesthetic requirements of medical imaging, image super-resolution reconstruction remains a difficult challenge. In this paper, we propose a deep learning-based strategy for reconstructing medical images from low resolutions using Transformer and Generative Adversarial Networks (T-GAN). By embedding the Transformer into the generative adversarial network for image reconstruction, the integrated system can extract more precise texture information and focus more on important locations through global image matching. Furthermore, we use a weighted combination of content loss, adversarial loss, and adversarial feature loss as the final multi-task loss function when training our proposed T-GAN model. On established measures such as PSNR and SSIM, our proposed T-GAN achieves optimal performance and recovers more texture features in super-resolution reconstruction of MRI scans of the knee and abdomen.
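
For readers unfamiliar with PSNR, one of the two metrics named in the abstract, the sketch below computes it from its standard definition, 10 * log10(MAX^2 / MSE). The 2x2 "images" and the 8-bit peak value of 255 are illustrative assumptions and have nothing to do with the paper's data or evaluation code.

```python
import math


def mean_squared_error(reference, reconstructed):
    """Mean squared pixel difference between two equally sized 2-D images."""
    total, count = 0.0, 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for ref_pixel, rec_pixel in zip(ref_row, rec_row):
            total += (ref_pixel - rec_pixel) ** 2
            count += 1
    return total / count


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = mean_squared_error(reference, reconstructed)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [[0, 255], [255, 0]]
    super_resolved = [[0, 250], [255, 5]]   # two pixels off by 5 gray levels
    print(round(psnr(ground_truth, super_resolved), 2))
```

Higher values mean the reconstruction is closer to the reference in a pixel-wise sense; PSNR says little about perceived texture, which is why it is usually reported together with SSIM.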

MonoNeRF: Learning a Generalizable Dynamic Radiance Field from Monocular Videos

Fengrui Tian, Shaoyi Du, Yueqi Duan

Categories: Computer Vision

2022-12-26

In this paper, we target the problem of learning a generalizable dynamic radiance field from monocular videos. Unlike most existing NeRF methods, which are based on multiple views, monocular videos contain only one view at each timestamp and therefore suffer from ambiguity along the view direction when estimating point features and scene flows. Previous studies such as DynNeRF disambiguate point features by positional encoding, which is not transferable and severely limits generalization ability. As a result, these methods have to train one independent model for each scene and incur heavy computational costs when applied to the growing number of monocular videos in real-world applications. To address this, we propose MonoNeRF, which simultaneously learns point features and scene flows with point trajectory and feature correspondence constraints across frames. More specifically, we learn an implicit velocity field to estimate point trajectories from temporal features with a Neural ODE, followed by a flow-based feature aggregation module that obtains spatial features along the point trajectory. We jointly optimize temporal and spatial features by training the network end to end. Experiments show that MonoNeRF is able to learn from multiple scenes and supports new applications such as scene editing, unseen frame synthesis, and fast novel scene adaptation.
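
The idea of recovering a point trajectory from a velocity field can be sketched with plain numerical integration. The code below is only an illustration under strong assumptions: a hand-written velocity function stands in for the learned implicit velocity field, and fixed-step explicit Euler integration stands in for whichever ODE solver a Neural ODE implementation would use. All names are hypothetical.

```python
def integrate_trajectory(point, velocity_field, t0, t1, steps):
    """Explicit Euler integration of dx/dt = velocity_field(x, t).

    Returns the list of positions visited, starting with `point`.
    """
    dt = (t1 - t0) / steps
    position, t = list(point), t0
    trajectory = [tuple(position)]
    for _ in range(steps):
        velocity = velocity_field(position, t)
        position = [x + dt * v for x, v in zip(position, velocity)]
        t += dt
        trajectory.append(tuple(position))
    return trajectory


def constant_drift(position, t):
    """Stand-in for a learned velocity field: the same motion everywhere."""
    return (1.0, 2.0, 0.0)


if __name__ == "__main__":
    path = integrate_trajectory((0.0, 0.0, 0.0), constant_drift, 0.0, 1.0, 10)
    print(path[-1])
```

With a learned field, the positions along such a trajectory would be the locations at which per-frame features are gathered and aggregated.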
